Abstract: Using online consumer reviews to assist purchase-decision making has become increasingly popular. Most existing systems process user reviews to find the information that is useful for a purchase decision, but one can hardly read all reviews to obtain a fair evaluation of a product or service. A framework that summarizes reviews would have to perform, as one subtask, the detection of the general aspect categories addressed in review sentences, for which this project presents two methods. The first is an unsupervised method that applies association rule mining to co-occurrence frequency data obtained from a corpus to find these aspect categories. While not on par with state-of-the-art supervised methods, the proposed unsupervised method performs better than several simple baselines, a similar but supervised method, and a supervised baseline, with an F1-score of 67%. The second method is a supervised variant that outperforms existing methods with an F1-score of 84%.
Introduction
Summary:
Data mining, also known as knowledge discovery in databases (KDD), involves extracting new and useful information from large datasets. Often, data saved for one purpose can later be analyzed to uncover additional valuable insights.
Sentiment Analysis and Electronic Word of Mouth (EWoM):
Word of Mouth (WoM) significantly influences consumer decisions. With the internet, electronic WoM (EWoM) via platforms like Twitter, Facebook, Amazon reviews, and Yelp has grown, providing rich customer feedback that impacts both buyers and businesses. Automated summarization of these reviews is needed due to the massive volume of data.
Aspect-Based Sentiment Analysis:
This technique identifies specific attributes (aspects) of products or services people talk about. Approaches include supervised (using labeled data) and unsupervised (without labeled data) methods.
Proposed Methods:
Unsupervised Method:
Uses spreading activation on a word co-occurrence graph to detect aspect categories. Seed words representing categories are used to mine association rules linking words to aspects. This method doesn’t require labeled training data.
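The idea of spreading activation over a co-occurrence graph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method's actual implementation: the edge weighting (share of a word's sentences that also contain the source word), the `decay`, `fire_threshold`, and `rule_threshold` parameters, the function names, and the toy sentences are all hypothetical choices made for the example.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence_graph(sentences):
    """Build a directed graph where graph[a][b] is the fraction of the
    sentences containing b that also contain a (an assumed weighting)."""
    word_count = Counter()
    pair_count = Counter()
    for tokens in sentences:
        unique = set(tokens)
        word_count.update(unique)
        pair_count.update(permutations(unique, 2))
    graph = defaultdict(dict)
    for (a, b), n in pair_count.items():
        graph[a][b] = n / word_count[b]
    return graph


def spread_activation(graph, seeds, decay=0.9, fire_threshold=0.3, max_steps=10):
    """Start with activation 1.0 on the seed words and let each word fire
    once, passing a decayed, weighted share of its activation to neighbors."""
    activation = {seed: 1.0 for seed in seeds}
    frontier = set(seeds)
    fired = set()
    for _ in range(max_steps):
        snapshot = dict(activation)  # fire with the values from the step start
        next_frontier = set()
        for node in frontier:
            if node in fired or snapshot[node] < fire_threshold:
                continue
            fired.add(node)
            for neighbor, weight in graph.get(node, {}).items():
                old = activation.get(neighbor, 0.0)
                new = min(1.0, old + snapshot[node] * weight * decay)
                if new > old:
                    activation[neighbor] = new
                    next_frontier.add(neighbor)
        if not next_frontier:
            break
        frontier = next_frontier
    return activation


def mine_rules(graph, seed_sets, rule_threshold=0.5):
    """Return {category: words whose activation reaches rule_threshold},
    read as association rules of the form word -> category."""
    rules = {}
    for category, seeds in seed_sets.items():
        activation = spread_activation(graph, seeds)
        rules[category] = {w for w, a in activation.items() if a >= rule_threshold}
    return rules


if __name__ == "__main__":
    # Toy corpus of tokenized review sentences (stop words already removed).
    corpus = [
        ["pizza", "delicious"],
        ["food", "delicious", "pasta"],
        ["food", "tasty"],
        ["waiter", "rude"],
        ["service", "waiter", "friendly"],
        ["service", "slow"],
    ]
    graph = build_cooccurrence_graph(corpus)
    print(mine_rules(graph, {"food": {"food"}, "service": {"service"}}))
```

A sentence would then be assigned every category for which it contains at least one word from the mined rule set. The thresholds control how far activation travels from the seeds, which is why they need tuning per category.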
Supervised Method:
Utilizes co-occurrence of lemmas and syntactic dependency relations with annotated categories to calculate conditional probabilities and assign aspect categories to sentences. This method relies on labeled training data.
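The conditional-probability step can be illustrated with a small Python sketch. It assumes that each training sentence has already been reduced to a list of indicator strings (lemmas here; dependency relations could be added as further strings such as a hypothetical `"amod(pizza,great)"`), that P(category | indicator) is estimated by simple relative frequency, and that a category is assigned when its best indicator reaches a threshold. The function names, the threshold value, and the toy data are illustrative assumptions, not taken from the original work.

```python
from collections import Counter, defaultdict


def train(labeled_sentences):
    """Estimate P(category | indicator) as
    (#sentences with indicator and category) / (#sentences with indicator)."""
    indicator_count = Counter()
    joint_count = defaultdict(Counter)
    for indicators, categories in labeled_sentences:
        for indicator in set(indicators):
            indicator_count[indicator] += 1
            for category in categories:
                joint_count[indicator][category] += 1
    return {
        indicator: {cat: n / indicator_count[indicator] for cat, n in cats.items()}
        for indicator, cats in joint_count.items()
    }


def predict(model, indicators, threshold=0.6):
    """Assign every category whose highest conditional probability over the
    sentence's indicators reaches the (assumed) threshold."""
    best = defaultdict(float)
    for indicator in set(indicators):
        for category, probability in model.get(indicator, {}).items():
            best[category] = max(best[category], probability)
    return {cat for cat, p in best.items() if p >= threshold}


if __name__ == "__main__":
    # Toy annotated data: (lemmas of the sentence, gold aspect categories).
    training = [
        (["pizza", "be", "great"], {"food"}),
        (["waiter", "be", "rude"], {"service"}),
        (["pizza", "arrive", "late"], {"food", "service"}),
        (["great", "view"], {"ambience"}),
    ]
    model = train(training)
    print(predict(model, ["the", "pizza", "be", "cold"]))
    print(predict(model, ["waiter", "bring", "pizza"]))
```

Because a sentence can receive several categories at once, the prediction is a set rather than a single label, which matches the multi-category sentences found in review data.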
Evaluation:
Both methods were tested on the SemEval-2014 restaurant review datasets. The data show that many sentences have multiple categories and that many aspect mentions are implicit (not directly stated). The supervised method achieves a high F1-score (~83%) by combining lemma and dependency indicators, outperforming the unsupervised method, which struggles with abstract categories like "ambience."
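For sentences that may carry several categories, precision, recall, and F1 can be computed over all (sentence, category) pairs. The sketch below assumes micro-averaging, which is a common choice for this kind of multi-label task; the function name and the toy labels are illustrative and do not reproduce the reported scores.

```python
def micro_f1(gold, predicted):
    """Micro-averaged precision, recall, and F1 over per-sentence category sets."""
    true_positives = sum(len(g & p) for g, p in zip(gold, predicted))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy example: gold and predicted category sets for three sentences.
    gold = [{"food", "service"}, {"ambience"}, {"food"}]
    predicted = [{"food"}, {"ambience", "price"}, {"food"}]
    print(micro_f1(gold, predicted))
```

A missed category lowers recall and a spurious one lowers precision, so a method that rarely predicts a hard category such as "ambience" is penalized mainly through recall.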
Conclusion
The unsupervised method offers the advantage of no labeled data requirement but needs careful tuning of thresholds. The supervised method is more accurate but needs annotated data. Both contribute useful tools for analyzing consumer reviews and extracting actionable insights.